On Scheduling Mesh-Structured Computations on Unreliable Computers on the Internet
Abstract
This work studies the problem of improving the effectiveness of computing dependent tasks over the Internet. The distributed system consists of a reliable server that coordinates the computation of a massive number of unreliable workers. It is known that the server cannot always ensure that the result of a task is correct without computing the task itself. This fact has a significant impact on computing interdependent tasks. Since the computational capacity of the server may be restricted, along with the time available to complete the computation, the server may be able to compute only selected tasks, without knowing whether the remaining tasks were computed correctly by workers. But an incorrectly computed task may render the results of all dependent tasks incorrect. Thus it may become important for the server to compute judiciously selected tasks, so as to maximize the number of correct results. We assume that any worker computes correctly with probability $p<1$. Any incorrectly computed task corrupts all dependent tasks. The goal is to determine which tasks should be computed by the (reliable) server and which by the (unreliable) workers, and when, so as to maximize the expected number of correct results, under a constraint $d$ on the computation time. We show that this optimization problem is NP-hard. Then we study optimal scheduling algorithms for the mesh with the tightest deadline. We present combinatorial arguments that completely describe optimal solutions for two ranges of values of worker reliability $p$: when $p$ is close to zero and when $p$ is close to one. This research is joint work with Grzegorz Malewicz. Its extended abstract will appear in OPODIS’04.
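The objective described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm, and it ignores the deadline $d$ and the timing of the schedule entirely. It assumes a rectangular mesh in which task $(i, j)$ depends on $(i-1, j)$ and $(i, j-1)$, that workers fail independently, that the server always computes its own task correctly, and that a result is correct only if the task itself and all of its ancestors were computed correctly. Under these assumptions, the probability that task $(i, j)$ is correct is $p^k$, where $k$ is the number of worker-computed tasks among $(i, j)$ and its ancestors. The function name `expected_correct` and its parameters are hypothetical.

```python
from itertools import product


def expected_correct(rows, cols, p, server_tasks=frozenset()):
    """Expected number of correct results on a rows x cols mesh.

    Assumptions (illustrative, not taken from the paper):
      - task (i, j) depends on (i-1, j) and (i, j-1);
      - each worker-computed task is correct independently with probability p;
      - tasks in server_tasks are always computed correctly by the server;
      - a result is correct only if the task and all its ancestors
        were computed correctly.
    """
    total = 0.0
    for i, j in product(range(rows), range(cols)):
        # Ancestors of (i, j), plus (i, j) itself, are all (a, b)
        # with a <= i and b <= j. Count those left to workers.
        worker_count = sum(
            1
            for a, b in product(range(i + 1), range(j + 1))
            if (a, b) not in server_tasks
        )
        # By linearity of expectation, sum the per-task probabilities.
        total += p ** worker_count
    return total


if __name__ == "__main__":
    # Compare leaving everything to workers against having the server
    # compute the source task (0, 0) of a 2 x 2 mesh.
    print(expected_correct(2, 2, 0.5))
    print(expected_correct(2, 2, 0.5, {(0, 0)}))
```

In this toy model, having the server compute a task near the source of the mesh raises the success probability of every task that depends on it, which is the intuition behind choosing server tasks judiciously.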
Similar resources
Extended Abstract: "No-Compile-Time Knowledge" Distribution of Finite Element Computations on Multiprocessors
This paper addresses partitioning and scheduling of irregular loops arising in finite element computations on unstructured meshes. Target computers are Distributed Memory Parallel Computers that provide a global address space. We introduce the concept of “Conditioned Iterations Loop” which distributes the iterations dynamically according to a runtime condition. This technique is improved by a ...
On Scheduling Collaborative Computations on the Internet, I: Mesh-Dags and Their Close Relatives
Advancing technology has rendered the Internet a viable medium for collaborative computing, via mechanisms such as Web-Based Computing and Grid-Computing. We present a “pebble game” that abstracts the process of scheduling a computation-dag for computing over the Internet, including a novel formal criterion for comparing the qualities of competing schedules. Within this formal setting, we ident...
On Batch-Scheduling Dags for Internet-Based Computing
The process of scheduling computations for Internet-based computing presents challenges not encountered with more traditional platforms for parallel and distributed computing. The looser coupling among participating computers makes it harder to utilize remote clients well and also raises the specter of a kind of “gridlock” that ensues when a computation stalls because no new tasks are eligible f...
Computations of Unsteady Viscous Compressible Flows Using Adaptive Mesh Refinement in Curvilinear Body-Fitted Grid Systems
A methodology for accurate and efficient simulation of unsteady, compressible flows is presented. The cornerstones of the methodology are a special discretization of the Navier-Stokes equations on structured body-fitted grid systems and an efficient solution-adaptive mesh refinement technique for structured grids. The discretization employs an explicit multidimensional upwind scheme for the i...
Use of Computer and Internet in Agricultural Extension as perceived by Extension Workers
The purpose of this study was to determine computer and Internet use in agricultural extension by Extension Workers (EWs). This study used a descriptive-correlational design. The population for the study consisted of all extension workers (N = 320) in Isfahan Province, Iran. A stratified sampling technique and census were used to select EWs (n = 200). Overall, findings indicate that EWs have access to...